Abstract: We consider a power-constrained sensor network,
consisting of multiple sensor nodes and a fusion center (FC), that
is deployed for the purpose of estimating a common random
parameter of interest. In contrast to the distributed framework, the
sensor nodes are allowed to update their individual observations
by (linearly) combining observations from neighboring nodes.
The updated observations are communicated to the FC using
an analog amplify-and-forward modulation scheme and through
a coherent multiple access channel. The optimal collaborative
strategy is obtained by minimizing the cumulative transmission
power subject to a maximum distortion constraint. For the
distributed scenario (i.e., with no observation sharing), the solution
reduces to the power-allocation problem considered by Xiao et
al. [1]. Collaboration among neighbors significantly improves
the power efficiency of the network in the low local-SNR regime,
as demonstrated through an insightful example and numerical
simulations.



I. Introduction 

Wireless sensor networks consist of spatially distributed
battery-powered sensors that monitor certain environmental
conditions and often cooperate to perform specific signal
processing tasks like detection, estimation and classification [2].
In this paper, we consider a network that is deployed for the
purpose of estimating a common random parameter of interest.
After observing noisy versions of the parameter, the sensors
can share their observations with neighboring nodes,
an act referred to as collaboration in this paper (following [3]).
The observations from all the neighbors are linearly combined
and then transmitted to the fusion center (FC) through a
coherent MAC channel. The FC receives the noise-corrupted
signal and makes the final inference. The schematic diagram
of such a system is shown in Figure 1 (we will introduce the
notation and describe each block later in Section II).

The individual sensor nodes are battery powered and hence
the network, as a whole, is highly power limited. In the
absence of a power limit, the sensors could ideally collaborate
with all the other nodes, make the inference in the
network, and transmit the estimated parameter to the FC
without any further distortion (by using infinite transmission
power). However, in the presence of a strict power constraint,
both collaboration and transmission have to be performed
judiciously, so as to maximize the quality of inference at the
FC. In this paper, we study the tradeoff between cumulative
transmission power and the quality of inference for a given
collaborative neighborhood. We assume cost-free collaboration,
i.e., the power required to share observations within
the neighborhood is negligibly small compared to the power
required to communicate with the FC. In an extended version
of this paper, we will address the more general problem
where collaboration incurs a finite cost.

Fig. 1. Sensor network performing collaborative estimation.

In the absence of collaboration, this problem is the same as
distributed estimation, which has been extensively researched
from both analog [1], [4] and digital [5], [6] encoding
perspectives. When the parameter to be estimated is a scalar,
as in our case, much of the problem formulation is similar
to distributed beamforming in relay networks [7], [8].
However, research regarding collaborative estimation is relatively
nascent. When the transmission channels are orthogonal and
cost-free collaboration is possible within a fully connected
sensor network, it has been shown in [3] that it is optimal to
perform the inference in the network and use the best available
channel to transmit the estimated parameter. In this paper,
we study the optimal collaboration design for a partially
connected network and a coherent MAC channel.

II. Problem Formulation 

We consider the scenario where the parameter of interest
is a scalar random variable with known statistics, specifically,
Gaussian distributed with zero mean and variance $\eta^2$.
The observations at the sensor nodes $n = 1, 2, \ldots, N$ are
governed by the linear model $x_n = h_n \theta + \epsilon_n$, where $h_n$
is the source attenuation and $\epsilon_n$ is the measurement noise.
Let $h = [h_1, h_2, \ldots, h_N]^T$. The measurement noise vector
$\epsilon = [\epsilon_1, \epsilon_2, \ldots, \epsilon_N]^T$ is assumed to be zero-mean, Gaussian
with (spatial) covariance $E[\epsilon \epsilon^T] = \Sigma$. Perfect knowledge of
the observation model parameters $\{h_n\}_{n=1}^{N}$ and $\Sigma$ is assumed.
In vector notation, the observation model is

$x = h\theta + \epsilon, \quad \theta \sim \mathcal{N}(0, \eta^2), \quad \epsilon \sim \mathcal{N}(0, \Sigma),$    (1)

where $x = [x_1, x_2, \ldots, x_N]^T$ denotes the vector of observations.



We consider an extension of the analog amplify-and-forward
scheme as our encoding and modulation framework for
communication to the fusion center. In the basic amplify-and-forward
scheme, each node transmits a weighted version
of its own observation, say $W_n x_n$, with resulting power
$W_n^2 E[x_n^2]$. Such a scheme is appealing and often used (e.g.,
E), HL 0) for two reasons: 1) Uncoded nature: it does
not require block coding across time and is hence efficient for
low-latency systems; 2) Optimality: for a memoryless Gaussian
source transmitted through an AWGN channel (Figure 1 with
$N = 1$), an amplify-and-forward scheme achieves the
optimal power-distortion tradeoff in an information-theoretic
sense (see Example 2.2 in [9]). The optimality of linear coding
has also been established in [10] for distributed estimation over a
coherent MAC (Figure 1 without spatial collaboration) when
the observation noise is spatially uncorrelated.

Let the availability of collaborative links among the various
nodes be represented by the $N \times N$ adjacency matrix (not
necessarily symmetric) $A$, where $A_{ij} \in \{0, 1\}$. An entry
$A_{ij} = 1$ signifies that node $j$ shares its observations with
node $i$. Sharing of this observation is assumed to be realized
through a reliable communication link that consumes power
$C_{ij}$, regardless of the actual value of the observation. The $N \times N$
matrix $C$ describes all the costs of collaboration among
the various sensors and is assumed to be known. Since each node
is trivially connected to itself, $A_{ii} = 1$ and $C_{ii} = 0$. We denote
the set of all A-sparse matrices as

$\mathcal{S}_A = \{ W \in \mathbb{R}^{N \times N} : W_{ij} = 0 \text{ if } A_{ij} = 0 \}.$    (2)



Corresponding to an adjacency matrix $A$ and an A-sparse
matrix $W$, we define collaboration in the network as individual
nodes being able to linearly combine local observations from
other collaborating nodes, $z_n = \sum_{j : A_{nj} = 1} W_{nj} x_j$. In effect,
the network is able to achieve a one-shot spatial transformation
$W : x \to z$ of the form

$z = W x, \quad W \in \mathcal{S}_A.$    (3)



We refer to $W$ as the collaboration matrix. It may be
noted that: 1) Particularization: when $W$ is a diagonal matrix
(equivalently, $A$ is the identity matrix $I_N$), our collaborative
scheme simplifies to the basic amplify-and-forward strategy
[1]; 2) Collaboration cost: any collaboration involving $W \in \mathcal{S}_A$
is achieved at the expense of (cumulative) power

$Q_A = \sum_{i=1}^{N} \sum_{j=1}^{N} C_{ij} A_{ij},$    (4)

and 3) Transmission cost: the (cumulative) power required
for transmission of the encoded message $z$ is

$P_W = E_{\theta,\epsilon}[z^T z] = \mathrm{Tr}\left[ W (\Sigma + \eta^2 h h^T) W^T \right].$    (5)
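The A-sparsity constraint (2) and the transmission power (5) can be illustrated with a minimal NumPy sketch. All numerical values below (gains, noise covariance, the three directed links) are assumptions chosen only for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
eta2 = 1.0                                   # prior variance eta^2 (assumed value)
h = np.array([1.0, 0.8, 1.2, 0.9])           # observation gains (assumed values)
Sigma = 0.5 * np.eye(N)                      # observation noise covariance (assumed)

# Adjacency with self-loops plus three directed links; A[i, j] = 1 means j -> i.
A = np.eye(N, dtype=int)
A[0, 1] = A[1, 2] = A[3, 0] = 1

# Masking any matrix elementwise by A yields an A-sparse matrix, i.e. W in S_A.
W = rng.standard_normal((N, N)) * A

V = Sigma + eta2 * np.outer(h, h)            # covariance of x = h*theta + eps
P_W = np.trace(W @ V @ W.T)                  # cumulative transmission power, eq. (5)

# Cross-check: the per-node powers E[z_n^2] = W[n] V W[n]^T add up to P_W.
per_node = np.array([W[n] @ V @ W[n] for n in range(N)])
```

The per-node breakdown is also the quantity one would need if individual power constraints were of interest.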



The transformed observations $z$ are assumed to be transmitted
to the fusion center through a coherent MAC channel. In
practice, a coherent MAC channel can be realized through
transmit beamforming [11], where sensor nodes simultaneously
transmit a common message (in our case, all $z_k$'s are
scaled versions of a common $\theta$) and the phases of their
transmissions are controlled so that the signals constructively
combine at the FC. The channel gain at node $n$ is assumed
to be $g_n$ and the noise of the coherent MAC channel, $u$, is
assumed to be zero-mean AWGN with variance $\xi^2$. Perfect
knowledge of the channel state $\{g_n\}_{n=1}^{N}$ and $\xi^2$ is assumed.
Let $g = [g_1, g_2, \ldots, g_N]^T$. The output of the coherent MAC
channel (or the input to the fusion center) is

$y = g^T z + u, \quad u \sim \mathcal{N}(0, \xi^2).$    (6)



Having received $y$, the goal of the fusion center is to obtain
an accurate estimate $\hat{\theta}$ of the original random parameter $\theta$. We
consider the mean square error (MSE) as the distortion metric,

$D_W \triangleq E_{\theta,\epsilon,u}\left[ (\hat{\theta} - \theta)^2 \right].$

Since the measurement model
is (conditionally) linear and Gaussian (see (1), (3) and (6)),

$y \mid \theta \sim \mathcal{N}\left( g^T W h \, \theta, \; g^T W \Sigma W^T g + \xi^2 \right),$    (7)

the minimum mean square error (MMSE) estimator [12], $\hat{\theta} = E_{\theta,\epsilon,u}[\theta \mid y]$,
is used as the optimum fusion rule. It is well known
that the MMSE estimator attains the posterior Cramer-Rao lower bound,

$D_W = \left( \frac{1}{\eta^2} + J_W \right)^{-1}, \quad J_W \triangleq \frac{(g^T W h)^2}{g^T W \Sigma W^T g + \xi^2},$    (8)

where $J_W$ denotes the (conditional) Fisher information. It may
be noted here that, for the centralized case, i.e., where all the
observations $x$ are directly available at the FC, the benchmark
performance is

$D_0 = \left( \frac{1}{\eta^2} + J_0 \right)^{-1}, \quad J_0 \triangleq h^T \Sigma^{-1} h.$    (9)
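As a small numerical companion to (8) and (9), the following sketch evaluates the Fisher information and distortion for a diagonal (plain amplify-and-forward) collaboration matrix and compares them with the centralized benchmark. The function names and all parameter values are illustrative assumptions.

```python
import numpy as np

def fisher_information(W, h, Sigma, g, xi2):
    """Conditional Fisher information J_W of equation (8)."""
    v = W.T @ g                          # effective combiner: g^T W x = v^T x
    return (v @ h) ** 2 / (v @ Sigma @ v + xi2)

def distortion(J, eta2):
    """MMSE distortion (1/eta^2 + J)^(-1) for prior variance eta^2."""
    return 1.0 / (1.0 / eta2 + J)

# Illustrative (assumed) parameters.
eta2, xi2 = 1.0, 0.1
h = np.array([1.0, 0.8, 1.2])
g = np.array([0.9, 0.5, 0.7])
Sigma = np.array([[0.5, 0.1, 0.0],
                  [0.1, 0.5, 0.1],
                  [0.0, 0.1, 0.5]])
W = np.diag([1.0, 2.0, 0.5])             # diagonal W: no collaboration

J_W = fisher_information(W, h, Sigma, g, xi2)
J_0 = h @ np.linalg.solve(Sigma, h)      # centralized benchmark, equation (9)
D_W, D_0 = distortion(J_W, eta2), distortion(J_0, eta2)
```

By the Cauchy-Schwarz inequality, any $W$ gives $J_W < J_0$ when $\xi^2 > 0$, so the distortion always lies strictly between $D_0$ and the prior variance $\eta^2$.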



The design of the collaboration matrix $W$ is critical
since it affects both the power requirements and the estimation
performance of the entire application. Specifically, the
following quantities depend on $W$: 1) the resources required to
collaborate, $Q_{\mathrm{nz}(W)}$ (footnote 1), as described in (4); 2) the resources
required to transmit, $P_W$, as described in (5); and 3) the
final distortion of the estimate at the FC, $D_W$, provided by
(8). In this paper, we address the problem of designing the
optimum collaboration matrix subject to a (cumulative) power
constraint,

$\underset{W}{\text{minimize}} \; D_W, \quad \text{subject to} \; P_W + Q_{\mathrm{nz}(W)} \le P,$    (10)

where $P$ denotes the (cumulative) power available in the
network. It should be noted that, in addition to a cumulative
power constraint, there may be individual power constraints
corresponding to the various sensor nodes. However, we do
not address the individual power constraints in this paper and
this issue remains a worthy topic for future research.

Problem (10), in general, has no known globally optimal
solution. However, for the special case when the entries of
the collaboration cost matrix $C$ are either zero or infinity,
$C_{ij} \in \{0, \infty\}$, we will show that there exists a unique
solution, which can be derived in closed form.
Physically, this special case corresponds to the situation when
the topology of a network is fixed (and hence not subject to
design) and communication among neighbors is relatively
inexpensive compared to communication with the FC. Let
$A = \mathrm{zero}(C)$ denote the permitted adjacency matrix for such
a situation. Hence, the collaboration cost vanishes, $Q_A = 0$,
and problem (10) simplifies to

$\underset{W \in \mathcal{S}_A}{\text{minimize}} \; D_W, \quad \text{subject to} \; P_W \le P,$    (11)

which is an optimization problem in nnz(A) variables.

1 Definition of operators nz(·), zero(·), and nnz(·): The operator $\mathrm{nz} : \mathbb{R}^{N \times N} \to \{0, 1\}^{N \times N}$
is used to specify the non-zero elements of a matrix.
If $W_{ij} \neq 0$, then $[\mathrm{nz}(W)]_{ij} = 1$, else $[\mathrm{nz}(W)]_{ij} = 0$. Similarly, the
operator $\mathrm{zero} : \mathbb{R}^{N \times N} \to \{0, 1\}^{N \times N}$ is used to specify the zero elements of
a matrix, $[\mathrm{zero}(W)]_{ij} = 1 - [\mathrm{nz}(W)]_{ij}$. The operator $\mathrm{nnz} : \mathbb{R}^{N \times N} \to \mathbb{Z}_{+}$
is used to specify the number of non-zero elements of a matrix.
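The operators nz(·), zero(·) and nnz(·), and the construction $A = \mathrm{zero}(C)$ for a cost matrix with entries in $\{0, \infty\}$, can be sketched as follows. The function names mirror the operators in the text; the 3-node cost matrix is a made-up example.

```python
import numpy as np

def nz(W):
    """Indicator matrix of the non-zero entries of W."""
    return (W != 0).astype(int)

def zero(W):
    """Indicator matrix of the zero entries of W."""
    return 1 - nz(W)

def nnz(W):
    """Number of non-zero entries of W."""
    return int(np.count_nonzero(W))

# Hypothetical cost matrix: zero cost marks a permitted link, infinite cost a forbidden one.
C = np.array([[0.0,    np.inf, np.inf],
              [0.0,    0.0,    np.inf],
              [np.inf, 0.0,    0.0]])

A = zero(C)              # permitted adjacency matrix
num_variables = nnz(A)   # number of free weights in problem (11)
```

Here node 1 may use the observation of node 0 and node 2 that of node 1, so the ideal-collaborative problem has five free weights.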

Since problem (11) arises out of the assumption of zero
cost for collaboration, we refer to (11) as the ideal-collaborative
power-allocation problem. As regards the more
general case (problem (10) for arbitrary costs $C$ and the
topology being subject to design), one can start from the
distributed topology $A = I$ and follow a greedy algorithm
that augments the collaborative topology with the most
power-efficient link at each iteration. This extension is not discussed
here and is relegated to a later version of this paper.

III. Ideal-collaborative power-allocation 

From (8), we note that minimizing the distortion $D_W$ is
equivalent to maximizing the (conditional) Fisher information
$J_W$. Hence problem (11) is equivalent to

$\underset{W \in \mathcal{S}_A}{\text{maximize}} \; J_W, \quad \text{subject to} \; P_W \le P.$    (12)

Since multiplying $W$ by a scalar $\alpha > 1$ (strictly) increases
both $J_W$ and $P_W$ (and for $\alpha < 1$, strictly decreases them),
problem (12) is equivalent to its converse formulation, where
power is minimized subject to a minimum (conditional) Fisher
information $J \in (0, J_0)$,

$\underset{W \in \mathcal{S}_A}{\text{minimize}} \; P_W, \quad \text{subject to} \; J_W \ge J,$    (13)

in the sense that the optimal solutions $J_{\mathrm{opt}}(P)$ (of (12))
and $P_{\mathrm{opt}}(J)$ (of (13)) are inverses of one another. Moreover,
the optimal solutions hold with active constraints (satisfying
equalities). From (5) and (8), problem (13) is further equivalent
to

$\underset{W \in \mathcal{S}_A}{\text{minimize}} \; \mathrm{Tr}\left[ W (\Sigma + \eta^2 h h^T) W^T \right],$
$\text{subject to} \; g^T W (J \Sigma - h h^T) W^T g + J \xi^2 \le 0,$    (14)

which, on closer look, is a quadratically constrained quadratic
program (QCQP) in $L = \mathrm{nnz}(A)$ variables.

An explicit form of the QCQP can be obtained from
problem (14) by concatenating the elements of $W$ (column-wise,
only those that are allowed to be non-zero) in $w = [w_1, w_2, \ldots, w_L]^T$,
and accordingly transforming the other constants,

$\underset{w}{\text{minimize}} \; w^T \Omega w,$
$\text{subject to} \; w^T G Z G^T w + J \xi^2 \le 0,$ where    (15)

$W \to w, \quad V \to \Omega, \quad g \to G,$ and
$V = \Sigma + \eta^2 h h^T, \quad Z = J \Sigma - h h^T.$    (16)

Fig. 2. Transformations for QCQP formulation in explicit form - an example.

We illustrate the relevant transformations through an example
in Figure 2, with $N = 4$ nodes and 3 collaborating links, i.e.,
a total of $L = 7$ non-zero coefficients. Based on topology $A$, node
$k$ sends its observations to nodes $\mathcal{O}_k$ and receives observations
from nodes $\mathcal{I}_k$. The notations $\mathcal{O}_k^w$ and $\mathcal{I}_k^w$ similarly denote the
respective indices of $w$ as obtained from $W$. The matrix $\Omega$ is
formed from $V$ by copying the (sub)matrices $V_{\mathcal{I}_k, \mathcal{I}_k} \to \Omega_{\mathcal{I}_k^w, \mathcal{I}_k^w}$
for $k = 1, 2, \ldots, N$, satisfying $\mathrm{Tr}(W V W^T) = w^T \Omega w$. The
matrix $G$ is similarly formed from the vector $g$ by copying the
elements $g_{\mathcal{O}_k} \to G_{\mathcal{O}_k^w, k}$ for all $k$, satisfying $g^T W = w^T G$.
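The transformations $W \to w$, $V \to \Omega$ and $g \to G$ can be checked numerically. The sketch below builds $\Omega$ and $G$ entry by entry for a 4-node topology with 3 links (the particular links and all parameter values are assumptions, not necessarily those of Figure 2) and confirms the two identities stated above. The helper name `qcqp_matrices` is hypothetical.

```python
import numpy as np

def qcqp_matrices(A, V, g):
    """Return (idx, Omega, G) such that Tr(W V W^T) = w^T Omega w and g^T W = w^T G,
    where w stacks the permitted entries of W column by column."""
    N = A.shape[0]
    idx = [(i, j) for j in range(N) for i in range(N) if A[i, j]]
    L = len(idx)
    Omega = np.zeros((L, L))
    G = np.zeros((L, N))
    for a, (i, j) in enumerate(idx):
        G[a, j] = g[i]                    # weight W[i, j] reaches the FC through g[i]
        for b, (k, l) in enumerate(idx):
            if i == k:                    # both weights belong to the same transmitting node
                Omega[a, b] = V[j, l]
    return idx, Omega, G

rng = np.random.default_rng(1)
N = 4
A = np.eye(N, dtype=int)
A[0, 1] = A[1, 2] = A[3, 0] = 1           # 3 collaborating links, so L = 7
h = rng.uniform(0.5, 1.5, N)
g = rng.uniform(0.2, 1.0, N)
B = rng.standard_normal((N, N))
Sigma = B @ B.T + np.eye(N)               # a random positive definite covariance
V = Sigma + 1.0 * np.outer(h, h)          # eta^2 = 1 assumed

idx, Omega, G = qcqp_matrices(A, V, g)
W = rng.standard_normal((N, N)) * A       # arbitrary A-sparse matrix
w = np.array([W[i, j] for (i, j) in idx])

lhs_power, rhs_power = np.trace(W @ V @ W.T), w @ Omega @ w
lhs_gain, rhs_gain = g @ W, w @ G
```

With column-wise stacking, $\Omega$ is block diagonal only up to a permutation of the indices; the entrywise construction avoids having to track that permutation explicitly.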

The solution to problem (13) (equivalently, problems (12),
(14) and (15)) is summarized in Theorem 1.

Theorem 1: (Power-Distortion tradeoff for Linear Coherent
Ideal-Collaborative Estimation) Assuming $\Sigma$ to be positive
definite, the tradeoff between (conditional) Fisher information
and (cumulative) transmission power is

$J_{\mathrm{opt}}(P) = h^T \left( \Sigma + \Gamma / P_\xi \right)^{-1} h,$ where
$P_\xi = P / \xi^2, \quad \Gamma = \left( G^T \Omega^{-1} G \right)^{-1},$    (17)

which is achieved when the weights of the collaboration matrix
are

$w_{\mathrm{opt}} = \kappa \, \Omega^{-1} G \, \Gamma \left( \Sigma + \Gamma / P_\xi \right)^{-1} h,$    (18)

where the scalar $\kappa$ is such that $w_{\mathrm{opt}}^T \Omega w_{\mathrm{opt}} = P$. Equivalently,
for $J \in (0, J_0)$, $P_{\mathrm{opt}}(J) = J \xi^2 \mu_+(J)$, where $\mu_+(J)$ is the
only positive solution to the generalized eigenvalue problem
$(\Gamma + \mu Z) v = 0$ (note that $Z$ is a function of $J$).

Proof: See Appendix A. ■
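Theorem 1 can be exercised numerically. The sketch below evaluates (17) and (18) for an assumed 4-node topology with random parameters, maps $w_{\mathrm{opt}}$ back to a collaboration matrix, and then recomputes the Fisher information (8) and the power (5) directly from that matrix. The helper names and all numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def qcqp_matrices(A, V, g):
    """Omega and G of (15)-(16) for column-wise stacking of the permitted entries."""
    N = A.shape[0]
    idx = [(i, j) for j in range(N) for i in range(N) if A[i, j]]
    Omega = np.zeros((len(idx), len(idx)))
    G = np.zeros((len(idx), N))
    for a, (i, j) in enumerate(idx):
        G[a, j] = g[i]
        for b, (k, l) in enumerate(idx):
            if i == k:
                Omega[a, b] = V[j, l]
    return idx, Omega, G

def optimal_collaboration(A, h, Sigma, g, eta2, xi2, P):
    """Evaluate J_opt(P) from (17) and the optimal matrix W built from (18)."""
    V = Sigma + eta2 * np.outer(h, h)
    idx, Omega, G = qcqp_matrices(A, V, g)
    Gamma = np.linalg.inv(G.T @ np.linalg.solve(Omega, G))
    u = np.linalg.solve(Sigma + Gamma * xi2 / P, h)   # (Sigma + Gamma/P_xi)^{-1} h
    J_opt = h @ u
    w = np.linalg.solve(Omega, G @ (Gamma @ u))
    w *= np.sqrt(P / (w @ Omega @ w))                 # scale so that w^T Omega w = P
    W = np.zeros(A.shape)
    for a, (i, j) in enumerate(idx):
        W[i, j] = w[a]
    return J_opt, W

rng = np.random.default_rng(0)
N, eta2, xi2, P = 4, 1.0, 0.1, 2.0
A = np.eye(N, dtype=int)
A[0, 1] = A[1, 2] = A[3, 0] = 1
h = rng.uniform(0.5, 1.5, N)
g = rng.uniform(0.2, 1.0, N)
B = rng.standard_normal((N, N))
Sigma = B @ B.T + np.eye(N)

J_opt, W_opt = optimal_collaboration(A, h, Sigma, g, eta2, xi2, P)

# Independent evaluation of (5) and (8) for the returned matrix.
V = Sigma + eta2 * np.outer(h, h)
v = W_opt.T @ g
P_W = np.trace(W_opt @ V @ W_opt.T)
J_W = (v @ h) ** 2 / (v @ Sigma @ v + xi2)
J_0 = h @ np.linalg.solve(Sigma, h)
```

The sign of $\kappa$ is immaterial since $J_W$ depends on $W$ only through quadratic forms; the sketch takes the positive root.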

Theorem 1 is important since it shows the effect of (cumulative)
transmit power and the topology on the estimation
performance. Corresponding to the example topology in Figure
2 and randomly chosen system parameters $h$, $\Sigma$ and $g$, a
typical power-distortion tradeoff curve is shown in Figure 3
(bold line). Some remarks regarding Theorem 1 are in order.

TABLE I
Distortion and optimal weights for low and high SNR limits.

Remark 1 (Distributed and fully connected cases): For the
distributed scenario, $A = I$, and we have $w = \mathrm{diag}(W)$ (footnote 2),
$\Omega = \mathrm{diag}(\mathrm{diag}(V))$ and $G = \mathrm{diag}(g)$. Furthermore, when $\Sigma$
is diagonal (equivalently, when the observation noise is spatially
uncorrelated), equation (17) reduces to

$J_{\mathrm{opt}}^{\mathrm{dist}}(P) = \sum_{n=1}^{N} \frac{h_n^2}{\sigma_n^2 + \dfrac{\sigma_n^2 + \eta^2 h_n^2}{P_\xi \, g_n^2}},$ where $\sigma_n^2 \triangleq \Sigma_{nn},$    (19)

precisely the result obtained in [1].
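The reduction of (17) to the per-sensor sum (19) follows because, for $A = I$ and diagonal $\Sigma$, the matrix $\Gamma = (G^T \Omega^{-1} G)^{-1}$ is diagonal with entries $(\sigma_n^2 + \eta^2 h_n^2)/g_n^2$. A quick numerical check, with assumed parameter values:

```python
import numpy as np

# Illustrative (assumed) parameters for a 4-sensor distributed network.
eta2, xi2, P = 1.0, 0.1, 5.0
h = np.array([1.0, 0.8, 1.2, 0.9])
g = np.array([0.9, 0.5, 0.7, 0.3])
sigma2 = np.array([0.5, 0.4, 0.6, 0.3])      # diagonal of Sigma
Sigma = np.diag(sigma2)
P_xi = P / xi2

# Matrix form (17) with Omega = diag(diag(V)) and G = diag(g).
V = Sigma + eta2 * np.outer(h, h)
Omega = np.diag(np.diag(V))
G = np.diag(g)
Gamma = np.linalg.inv(G.T @ np.linalg.inv(Omega) @ G)
J_matrix = h @ np.linalg.solve(Sigma + Gamma / P_xi, h)

# Closed-form sum (19).
J_closed = np.sum(h**2 / (sigma2 + (sigma2 + eta2 * h**2) / (P_xi * g**2)))
```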

For the fully connected scenario, $A = \mathbf{1}\mathbf{1}^T$, we have
$w = \mathrm{vec}(W)$, $\Omega = V \otimes I$, $G = I \otimes g$, and subsequently the
following result.

Proposition 2: (Power-distortion tradeoff for fully connected
topology):

$J_{\mathrm{opt}}^{\mathrm{conn}}(P) = \frac{J_0}{1 + \dfrac{1 + \eta^2 J_0}{P_\xi \|g\|^2}}, \quad W_{\mathrm{opt}}^{\mathrm{conn}} \propto g \, h^T \Sigma^{-1}.$    (20)

Furthermore, the distortion resulting from (20) is information
theoretically optimal.

Proof: See Appendix B. ■

The information theoretic optimality is expected since a
fully connected network is equivalent to the centralized
scenario with effective channel gain $\|g\|$.
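The fully connected case can be checked against the general formula (17) using Kronecker products. The sketch below assumes the closed form $J_0 / \left(1 + (1 + \eta^2 J_0)/(P_\xi \|g\|^2)\right)$, which follows from substituting $\Gamma = V / \|g\|^2$ into (17) and applying the rank-one inverse identity; the parameter values are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, eta2, xi2, P = 3, 1.0, 0.1, 5.0
h = rng.uniform(0.5, 1.5, N)
g = rng.uniform(0.2, 1.0, N)
B = rng.standard_normal((N, N))
Sigma = B @ B.T + np.eye(N)
V = Sigma + eta2 * np.outer(h, h)
P_xi = P / xi2

# Fully connected topology: w = vec(W), Omega = V (x) I, G = I (x) g.
Omega = np.kron(V, np.eye(N))
G = np.kron(np.eye(N), g.reshape(N, 1))
Gamma = np.linalg.inv(G.T @ np.linalg.solve(Omega, G))

J_0 = h @ np.linalg.solve(Sigma, h)
u = np.linalg.solve(Sigma + Gamma / P_xi, h)
J_theorem = h @ u                                        # general formula (17)
J_closed = J_0 / (1.0 + (1.0 + eta2 * J_0) / (P_xi * (g @ g)))

# Optimal weights from (18), reshaped column-wise back into W.
w = np.linalg.solve(Omega, G @ (Gamma @ u))
W = w.reshape(N, N, order="F")
T = np.outer(g, np.linalg.solve(Sigma, h))               # g h^T Sigma^{-1}
cosine = abs(np.sum(W * T)) / (np.linalg.norm(W) * np.linalg.norm(T))
```

A cosine of one between `W` and `T` indicates that the two matrices are proportional, which is the claimed structure of the optimal collaboration matrix.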

Remark 2 (Limits and a lower bound): To better understand
the dependence of the distortion $D$ on the (cumulative) SNR
$P_\xi$, we compute the low and high SNR limits of the distortion
(and optimal weights, up to a second-order Taylor series) in
Table I. For any topology $A$ (and consequently $\Gamma$), provided a
large (cumulative) power is available, the resultant distortion
approaches that of the centralized case, $D_0$ (see (9)). In
low-SNR situations, the distortion approaches that of the prior,
$\eta^2$. Towards the goal of obtaining a simpler approximation of
(17) for both the low and high SNR regimes, we obtain the
following result.

Proposition 3 (Lower bound on distortion): Define

$\frac{1}{J_+(P)} \triangleq \frac{1}{J_0} + \frac{1}{P_\xi \, h^T \Gamma^{-1} h}, \quad D_-(P) \triangleq \left( \frac{1}{\eta^2} + J_+(P) \right)^{-1}.$    (21)

Then, $J_+(P) \ge J_{\mathrm{opt}}(P)$ and hence $D_-(P) \le D_{\mathrm{opt}}(P)$.

Proof: Follows from equation (17) and the fact that both
$\Sigma$ and (consequently) $\Gamma$ are positive definite, then applying
Lemma 5 (Appendix C). ■
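The bound of Proposition 3 and the two SNR limits of Remark 2 can be observed numerically. The sketch below uses the distributed topology ($A = I$, so that $\Gamma$ is diagonal) with equicorrelated noise; these choices and all values are assumptions made only to keep the example short.

```python
import numpy as np

eta2, xi2 = 1.0, 0.1
h = np.array([1.0, 0.8, 1.2, 0.9])
g = np.array([0.9, 0.5, 0.7, 0.3])
Sigma = 0.5 * (0.7 * np.eye(4) + 0.3 * np.ones((4, 4)))   # correlated noise (assumed)
V = Sigma + eta2 * np.outer(h, h)
Gamma = np.diag(np.diag(V) / g**2)      # Gamma for the distributed topology A = I

J_0 = h @ np.linalg.solve(Sigma, h)
c = h @ np.linalg.solve(Gamma, h)       # h^T Gamma^{-1} h

def J_opt(P):
    """Equation (17) with P_xi = P / xi^2."""
    return h @ np.linalg.solve(Sigma + Gamma * xi2 / P, h)

def J_plus(P):
    """Upper bound on Fisher information from (21)."""
    return 1.0 / (1.0 / J_0 + xi2 / (P * c))

powers = np.logspace(-3, 4, 50)
gap = np.array([J_plus(P) - J_opt(P) for P in powers])   # non-negative by Proposition 3

J_high = J_opt(1e8)      # high-SNR regime: approaches the centralized J_0
J_low = J_opt(1e-8)      # low-SNR regime: approaches 0, i.e. D -> eta^2
```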
Both the high and low-SNR limits and the lower bound
$D_-$ are displayed in Figure 3. From Figure 3 we verify that
both the low and high SNR limits are quite accurate (in their
respective regimes) and that the lower bound, while accurate in
both limits, always satisfies $D_- \le D$.

Fig. 3. Power-distortion tradeoff from Theorem 1.

Remark 3 (Decentralized computation of collaborative
strategies): The optimal combining weights in Table I, besides
being accurate in the low and high-SNR regimes respectively,
have appealing interpretations that can facilitate decentralized
computation of collaborative strategies, thereby requiring
less coordination with the fusion center and facilitating
faster adaptation to dynamically changing topologies. Firstly,
it can be shown that $w \propto \Omega^{-1} G h$ (low-SNR regime)
corresponds to the case where each node performs local
MMSE estimation. Computation of the optimal combining
weights can hence be performed from local observation and
covariance models only. Secondly, $w \propto \Omega^{-1} G \Gamma \Sigma^{-1} h$
(high-SNR regime) can be shown to correspond to the solution of
a convex linearly constrained quadratic program (LCQP) with
separable objective function, which can be efficiently solved
in a decentralized manner [13]. We relegate the details to a
possible future extension of this paper.

Remark 4 (Closed form results for regular graphs): For
some combinations of signal parameters, network topology
and channel gains, the power-distortion tradeoff can be explicitly
derived. In Figure 4 we display a class of graphs, namely
the K-connected directed cycle, in which each node shares its
observations with the next $K$ nodes. Note that $K = 0$ denotes
the distributed scenario while $K = N - 1$ denotes the fully
connected scenario.

2 Definition of operators diag(·) and vec(·): While operating on a matrix,
$\mathrm{diag} : \mathbb{R}^{N \times N} \to \mathbb{R}^{N}$ is used to extract the diagonal elements. While
operating on a vector, $\mathrm{diag} : \mathbb{R}^{N} \to \mathbb{R}^{N \times N}$ is used to construct a matrix
by specifying only the diagonal elements, the other elements being zero. The
vectorization operator $\mathrm{vec} : \mathbb{R}^{N \times N} \to \mathbb{R}^{N^2}$ stacks up all the elements of a
matrix column-by-column.

Fig. 5. Efficiency in power achieved through collaboration in a 50-node
geometric graph.

Proposition 4: (Homogeneous and equicorrelated sensor
network with cycle topology) Assume a collaborative sensor
network with 1) identical observation gains, $h = h_0 \mathbf{1}$, 2)
equicorrelated and homogeneous observation noise,
$\Sigma = \sigma^2 \left( (1 - \rho) I + \rho \mathbf{1}\mathbf{1}^T \right)$, where $\rho \in [0, 1)$, 3) a K-connected
directed cycle as the neighborhood adjacency matrix $A$, and
4) identical channel gains, $g = g_0 \mathbf{1}$. For such a problem
setup, the lower bound in (21) is actually an equality, i.e.,
$D_{\mathrm{opt}}(P) = D_-(P)$, with $W_{\mathrm{opt}}^{(K)}(P) \propto A$ and

$\frac{1}{J_{\mathrm{opt}}^{(K)}(P)} = \frac{1}{J_0} + \frac{1}{P_\xi N g_0^2} \left[ \eta^2 + \frac{\sigma^2}{h_0^2} \left( \rho + \frac{1 - \rho}{K + 1} \right) \right].$    (22)

Proof: See Appendix D. ■
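Proposition 4 lends itself to a direct numerical check: build the K-connected directed cycle, evaluate (17) and (18) through the general machinery, and compare with the closed form (22). The helper `qcqp_matrices` is a hypothetical name for the $V \to \Omega$, $g \to G$ transformation, and all parameter values are assumptions.

```python
import numpy as np

def qcqp_matrices(A, V, g):
    """Omega and G of (15)-(16) for column-wise stacking of the permitted entries."""
    N = A.shape[0]
    idx = [(i, j) for j in range(N) for i in range(N) if A[i, j]]
    Omega = np.zeros((len(idx), len(idx)))
    G = np.zeros((len(idx), N))
    for a, (i, j) in enumerate(idx):
        G[a, j] = g[i]
        for b, (k, l) in enumerate(idx):
            if i == k:
                Omega[a, b] = V[j, l]
    return idx, Omega, G

N, K = 6, 2
h0, g0, sigma2, rho, eta2, xi2, P = 1.0, 0.5, 0.5, 0.3, 1.0, 0.1, 4.0

# K-connected directed cycle: node j shares with itself and the next K nodes.
A = np.zeros((N, N), dtype=int)
for j in range(N):
    for k in range(K + 1):
        A[(j + k) % N, j] = 1

h, g = h0 * np.ones(N), g0 * np.ones(N)
Sigma = sigma2 * ((1 - rho) * np.eye(N) + rho * np.ones((N, N)))
V = Sigma + eta2 * np.outer(h, h)

idx, Omega, G = qcqp_matrices(A, V, g)
Gamma = np.linalg.inv(G.T @ np.linalg.solve(Omega, G))
P_xi = P / xi2
u = np.linalg.solve(Sigma + Gamma / P_xi, h)
J_theorem = h @ u                                  # equation (17)
w = np.linalg.solve(Omega, G @ (Gamma @ u))        # direction of w_opt from (18)

# Closed form (22).
J_0 = N * h0**2 / (sigma2 * ((1 - rho) + rho * N))
bracket = eta2 + (sigma2 / h0**2) * (rho + (1 - rho) / (K + 1))
J_closed = 1.0 / (1.0 / J_0 + bracket / (P_xi * N * g0**2))
```

All entries of `w` coincide, which is the statement $W_{\mathrm{opt}}^{(K)} \propto A$: every permitted link carries the same weight.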
From Proposition 4 we readily infer the conditions under
which collaboration can be beneficial. For $K = 0, 1, \ldots, N - 1$,
let us denote by $P_{\mathrm{opt}}^{(K)}(J)$ the (minimum) power required
to obtain some prespecified distortion $D$ ($J$ and $D$ are related
by (8)). Then the relative power savings (RPS) obtained due
to collaboration is (from (22))

$\mathrm{RPS}(K, J) \triangleq 1 - \frac{P_{\mathrm{opt}}^{(K)}(J)}{P_{\mathrm{opt}}^{(0)}(J)} = \frac{1 - \rho}{1 + \gamma} \left( 1 - \frac{1}{K + 1} \right),$    (23)

which represents the gain compared to the distributed scenario
($K = 0$). Firstly, we note that $\mathrm{RPS}(K, J) \in [0, 1)$ (since
$\rho \in [0, 1)$ and $K \ge 0$), which shows that it is always beneficial
to collaborate, assuming cost-free collaboration. Also, more
(relative) power is saved when 1) the collaboration among
nodes increases (higher $K$), 2) the observation noise is less
correlated (lower $\rho$), and 3) the local-SNR is small (smaller
$\gamma = \eta^2 h_0^2 / \sigma^2$). When the local-SNR is large, say $\gamma = 100$, then even
a fully connected network can provide a power saving of only
1%. On the other hand, if the local-SNR is small, say $\gamma = 1$,
then a fully connected network can provide up to 50% power
savings.
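The two numerical claims above (about 1% at $\gamma = 100$ and up to 50% at $\gamma = 1$) can be reproduced with a few lines of plain Python. The sketch assumes the closed form $\frac{1-\rho}{1+\gamma} \cdot \frac{K}{K+1}$ for the RPS, obtained by inverting (22) for the required power; function names and parameter values are illustrative.

```python
def relative_power_savings(K, rho, gamma):
    """RPS(K, J) in closed form; it does not depend on the target J."""
    return (1.0 - rho) / (1.0 + gamma) * K / (K + 1.0)

def required_power(K, J, N, h0, g0, sigma2, rho, eta2, xi2):
    """P_opt^(K)(J) obtained by inverting (22); assumes 0 < J < J_0."""
    J_0 = N * h0**2 / (sigma2 * ((1 - rho) + rho * N))
    bracket = eta2 + (sigma2 / h0**2) * (rho + (1 - rho) / (K + 1))
    return xi2 * bracket / (N * g0**2 * (1.0 / J - 1.0 / J_0))

# Assumed setup: 50 nodes, fully connected (K = N - 1), uncorrelated noise.
N, h0, g0, sigma2, rho, eta2, xi2 = 50, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0
gamma = eta2 * h0**2 / sigma2                       # local-SNR, here 1
J_target = 0.5 * N * h0**2 / (sigma2 * ((1 - rho) + rho * N))

P_dist = required_power(0, J_target, N, h0, g0, sigma2, rho, eta2, xi2)
P_full = required_power(N - 1, J_target, N, h0, g0, sigma2, rho, eta2, xi2)
rps_from_powers = 1.0 - P_full / P_dist
rps_low_snr = relative_power_savings(N - 1, rho, gamma)     # close to 50%
rps_high_snr = relative_power_savings(N - 1, rho, 100.0)    # below 1%
```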

IV. Numerical Results 

To demonstrate the (cumulative) power saved due to
collaboration and to investigate whether the insights obtained
from Proposition 4 extend to more complicated scenarios, we
consider the following simulation setup. The spatial placement
and neighborhood structure is modeled as a Random Geometric
Graph, RGG(N, r) [14], where sensors are uniformly
distributed over a unit square with bidirectional communication
links present only for pairwise distances at most $r$, i.e., $A$ such
that $A_{ij} = \mathbb{1}[d_{ij} \le r]$. The noise is modeled with a homogeneous
and exponentially correlated Gaussian covariance matrix, i.e.,
$\Sigma$ is such that $\Sigma_{ij} = \sigma^2 \rho^{d_{ij}}$, where $\rho \in (0, 1)$ is indicative
of the degree of spatial correlation. A smaller value of $\rho$
indicates lower correlation, with $\rho \to 0$ signifying completely
independent observations. Specifically, we consider $\rho = 10^{-3}$
and $\rho = 10^{-7}$ to contrast the effect of correlation (for sensor
nodes apart by distance $d_{ij} = 0.1$, the actual correlations are
$\rho^{0.1} \approx 0.5$ and $\rho^{0.1} \approx 0.2$ respectively). We consider $N = 50$
nodes with identical local-SNR (specifically, $\sigma^2 = 0.5$, $h = \mathbf{1}$
with $\eta^2 = 1$ and $\eta^2 = 2$ for two separate runs). The individual
channel gains were generated as uniform random numbers
in the range (0, 1]. At each instance, power was allocated to
satisfy the pre-specified distortion performance of $\frac{\eta^2 + D_0}{2}$. We
display the power savings obtained after collaborating through
the RGG(N, r) topology, $P_{\mathrm{opt}}^{\mathrm{dist}} - P_{\mathrm{opt}}^{\mathrm{RGG}}$, as a percentage of the power
required for the distributed case, $P_{\mathrm{opt}}^{\mathrm{dist}}$, for increasing radius of
collaboration $r$, in Figure 5. We note that significant power is
saved through collaboration for different magnitudes of local-SNR,
$\eta^2$, and varying degrees of spatial correlation, $\rho$. Also,
we observe that (relative) power savings seem to increase with
lower spatial correlation and lower local-SNR, which were
also the insights obtained from the simpler example considered
in Proposition 4.
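The simulation ingredients described above (random geometric graph and exponentially correlated noise) can be generated as in the following sketch. This is not the authors' simulation code; the collaboration radius, the seed and the function names are assumptions.

```python
import numpy as np

def random_geometric_graph(N, r, rng):
    """Uniform points on the unit square; A[i, j] = 1 iff distance d_ij <= r."""
    pts = rng.uniform(0.0, 1.0, size=(N, 2))
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    A = (d <= r).astype(int)
    return A, d

def exp_correlated_cov(d, sigma2, rho):
    """Homogeneous, exponentially correlated covariance: Sigma_ij = sigma^2 * rho^d_ij."""
    return sigma2 * rho ** d

rng = np.random.default_rng(0)
N, r = 50, 0.2                                    # r = 0.2 is an assumed radius
A, d = random_geometric_graph(N, r, rng)
Sigma = exp_correlated_cov(d, sigma2=0.5, rho=1e-3)
g = 1.0 - rng.uniform(0.0, 1.0, N)                # channel gains in (0, 1]

# Correlation between two nodes 0.1 apart, for the two values of rho in the text.
corr_at_0p1 = (1e-3 ** 0.1, 1e-7 ** 0.1)
```

Since $d_{ii} = 0$, the adjacency matrix automatically contains the self-loops $A_{ii} = 1$ and the covariance has $\sigma^2$ on its diagonal.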

V. Conclusion 

In this paper, we addressed the problem of collaborative 
estimation in a sensor network where sensors communicate 
with the FC using a coherent MAC channel. For the scenario 
when the collaborative topology is fixed and collaboration is 
cost-free, we obtained the optimal power-distortion tradeoff in 
closed form by solving a QCQP problem. Through the use of
both theoretical and numerical results, we established that
collaboration helps to substantially lower the power requirements
in a network, especially in the low local-SNR regime. As future
work, we wish to explore the collaborative estimation problem
when the parameter to be estimated is a vector with correlated 
elements. The issue of collaboration with non-zero cost, as 
mentioned earlier, is also important. Finally, collaboration in 
the presence of individual power constraints (in addition to 
cumulative) is another topic worthy of future research. 

Appendix A
Proof of Theorem 1

Note in (15) that, though $\Omega$ is positive definite (since $\Sigma$
is), $Z$ is not (in fact, $Z$ has exactly one negative eigenvalue),
and hence problem (13) is not convex. However, a QCQP
with exactly one constraint (as in problem (15)) still satisfies
strong duality (footnote 3) (e.g., Appendix B, [13]) and hence the
optimal solution to (15) satisfies the Karush-Kuhn-Tucker (KKT)
conditions. Therefore, for some $\mu \ge 0$,

$(\Omega + \mu G Z G^T) w_{\mathrm{opt}} = 0_L,$  (KKT, (15))    (24)
$(I_L + \mu \Omega^{-1} G Z G^T) w_{\mathrm{opt}} = 0_L,$  ($\Omega$ is full rank)    (25)
$(G^T + \mu G^T \Omega^{-1} G Z G^T) w_{\mathrm{opt}} = 0_N,$
$(I_N + \mu G^T \Omega^{-1} G Z) v = 0_N,$  ($v \triangleq G^T w_{\mathrm{opt}}$)
$(\Gamma + \mu Z) v = 0_N,$  ($\Gamma$ is full rank, (17))    (26)
$(\Gamma + \mu J \Sigma - \mu h h^T) v = 0_N,$  (definition of $Z$, (16))
$(I_N - \mu (\Gamma + \mu J \Sigma)^{-1} h h^T) v = 0_N,$    (27)
$(1 - \mu h^T (\Gamma + \mu J \Sigma)^{-1} h)(h^T v) = 0,$  (multiplied by $h^T$)
$f(\mu) = 0,$  (define $f(\mu) \triangleq 1 - \mu h^T (\Gamma + \mu J \Sigma)^{-1} h$)    (28)

where (28) is because $h^T v = h^T G^T w_{\mathrm{opt}}$ is the numerator of the
Fisher information in (8) and hence non-zero. Note that, since
$\Gamma$ and $\Sigma$ are both positive definite, $f(\mu)$ is monotonically
decreasing for $\mu \ge 0$, with $f(0) = 1$ and $f(\infty) \searrow 1 - J_0 / J < 0$
(since $J < J_0$). Hence (28) must have a unique positive root
(denote it as $\mu_+(J)$). Since the constraint is active at the optimal
solution,

$w_{\mathrm{opt}}^T G Z G^T w_{\mathrm{opt}} + J \xi^2 = 0,$    (29)

which, along with $w_{\mathrm{opt}}^T \Omega w_{\mathrm{opt}} = P$ and (24), leads to
$J = P_\xi / \mu_+(J)$. Substituted in (28), this leads to equation (17).
From (27), we readily obtain $v \propto (\Gamma + \mu J \Sigma)^{-1} h$, and from
(25), we obtain

$w \propto \Omega^{-1} G Z v$
$\propto \Omega^{-1} G \Gamma v,$  (since $Z v \propto \Gamma v$, (26))

which, along with $\mu_+(J) = P_\xi / J$, gives (18).

3 The technical requirement of Slater's constraint qualification is not
discussed, but can be shown to be satisfied.

Appendix B
Proof of Proposition 2

Note that $\Omega = V \otimes I$ and $G = I \otimes g$. Hence,

$\Gamma = (G^T \Omega^{-1} G)^{-1} = \frac{V}{g^T g}.$    (30)

Substituting this value of $\Gamma$ in (17) (recall, $V = \Sigma + \eta^2 h h^T$),

$h^T (\Sigma + \Gamma / P_\xi)^{-1} h = h^T (\alpha \Sigma + \beta h h^T)^{-1} h,$    (31)

where $\alpha = 1 + \frac{1}{P_\xi \|g\|^2}$ and $\beta = \frac{\eta^2}{P_\xi \|g\|^2}$. From the Woodbury
matrix identity, for any $\alpha \neq 0$, $\beta$ (recall, $J_0 = h^T \Sigma^{-1} h$),

$(\alpha \Sigma + \beta h h^T)^{-1} = \frac{\Sigma^{-1}}{\alpha} - \frac{\beta \, \Sigma^{-1} h h^T \Sigma^{-1}}{\alpha (\alpha + \beta J_0)},$    (32)
$(\alpha \Sigma + \beta h h^T)^{-1} h = \frac{\Sigma^{-1} h}{\alpha + \beta J_0}.$    (33)

Applied to (31), this yields the expression of $J_{\mathrm{opt}}^{\mathrm{conn}}$ in (20).
Also,

$w_{\mathrm{opt}} \propto \Omega^{-1} G \Gamma (\Sigma + \Gamma / P_\xi)^{-1} h,$  (from (18))
$\propto \Omega^{-1} G \Gamma \Sigma^{-1} h,$  (see (31) and (33))
$\propto \Omega^{-1} G h,$  ($\Gamma \propto V$, $V \Sigma^{-1} h \propto h$)
$\propto (V^{-1} h) \otimes g,$  ($\Omega^{-1} G = V^{-1} \otimes g$)
$\propto (\Sigma^{-1} h) \otimes g,$  ($V^{-1} h \propto \Sigma^{-1} h$, see (33))    (34)

which implies that $W_{\mathrm{opt}}^{\mathrm{conn}} \propto g \, h^T \Sigma^{-1}$.

From Corollary 2.3.5 of [15], the sum-rate required to
encode a single-dimensional real-valued Gaussian source with
variance $\eta^2$, observed through the vector $h$ and Gaussian
observation noise with covariance $\Sigma$, in such a way that
reconstruction incurs an average distortion of at most $D$,
satisfies

$R_{\mathrm{tot}} \ge \frac{1}{2} \log \frac{\eta^4 J_0}{(D - D_0)(1 + \eta^2 J_0)},$    (35)

where $D_0$ is the centralized distortion defined in (9).
Since, for a fixed sum-power $P$, the sum-rate has to be less
than the (centralized) capacity of the coherent MAC channel,
i.e., $R_{\mathrm{tot}} \le C$, where $C = \frac{1}{2} \log(1 + \|g\|^2 P_\xi)$, we obtain

$\frac{\eta^4 J_0}{(D - D_0)(1 + \eta^2 J_0)} \le 1 + \|g\|^2 P_\xi.$    (36)

Replacing $D$ by $J$ (recall, $J = \frac{1}{D} - \frac{1}{\eta^2}$) and after some algebra
we obtain

$J \le J_{\mathrm{opt}}^{\mathrm{conn}},$    (37)

where $J_{\mathrm{opt}}^{\mathrm{conn}}$ is defined in (20). Hence, a fully connected
network that performs cost-free linear collaboration achieves
information theoretically optimal performance.
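The root-finding step in the proof of Theorem 1 (Appendix A) can be sketched numerically: bisect on $f(\mu) = 1 - \mu h^T(\Gamma + \mu J \Sigma)^{-1} h$ to find $\mu_+(J)$, set $P_{\mathrm{opt}}(J) = J \xi^2 \mu_+(J)$, and confirm that feeding this power back into (17) recovers the target $J$. The distributed topology ($A = I$) and all values are assumptions chosen to keep $\Gamma$ simple.

```python
import numpy as np

eta2, xi2 = 1.0, 0.1
h = np.array([1.0, 0.8, 1.2, 0.9])
g = np.array([0.9, 0.5, 0.7, 0.3])
Sigma = 0.5 * (0.7 * np.eye(4) + 0.3 * np.ones((4, 4)))
V = Sigma + eta2 * np.outer(h, h)
Gamma = np.diag(np.diag(V) / g**2)       # Gamma for the distributed topology A = I

J_0 = h @ np.linalg.solve(Sigma, h)

def f(mu, J):
    """f(mu) from (28); monotonically decreasing in mu, with f(0) = 1."""
    return 1.0 - mu * (h @ np.linalg.solve(Gamma + mu * J * Sigma, h))

def mu_plus(J, tol=1e-12):
    """Unique positive root of f(., J) for 0 < J < J_0, found by bisection."""
    lo, hi = 0.0, 1.0
    while f(hi, J) > 0.0:                # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid, J) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

J_target = 0.5 * J_0
P_required = J_target * xi2 * mu_plus(J_target)          # P_opt(J) = J xi^2 mu_+(J)
J_check = h @ np.linalg.solve(Sigma + Gamma * xi2 / P_required, h)   # equation (17)
```

Bisection is used here only because monotonicity of $f$ makes it robust; any scalar root finder would do.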



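Appendix C

Lemma 5 (An inequality): For any N-dimensional vector $p$
and $N \times N$ symmetric positive definite matrices $A$ and $B$,

$\frac{1}{p^T (A + B)^{-1} p} \ge \frac{1}{p^T A^{-1} p} + \frac{1}{p^T B^{-1} p}.$    (38)

Proof: Since $A, B \in \mathbb{S}_{++}$, $A^{-1/2} B A^{-1/2} \in \mathbb{S}_{++}$. Define
by $U$ and $\Lambda$ the following eigendecomposition: $A^{-1/2} B A^{-1/2} = U \Lambda U^T$.
Hence $\lambda_n > 0, \forall n$. Define $q \triangleq U^T A^{-1/2} p$. Note that

$q^T q = p^T A^{-1} p,$
$q^T \Lambda^{-1} q = p^T B^{-1} p,$ and
$q^T (I + \Lambda)^{-1} q = p^T (A + B)^{-1} p.$    (39)

Hence, to prove (38), it suffices to show that

$\frac{1}{\sum_{n=1}^{N} \frac{q_n^2}{1 + \lambda_n}} \ge \frac{1}{\sum_{n=1}^{N} q_n^2} + \frac{1}{\sum_{n=1}^{N} \frac{q_n^2}{\lambda_n}},$

or equivalently, with $a_n \triangleq \frac{1}{1 + \lambda_n}$ and $b_n \triangleq \frac{1 + \lambda_n}{\lambda_n}$,

$\left( \sum_{n=1}^{N} q_n^2 \right) \left( \sum_{n=1}^{N} q_n^2 a_n b_n \right) \ge \left( \sum_{n=1}^{N} q_n^2 a_n \right) \left( \sum_{n=1}^{N} q_n^2 b_n \right).$    (40)

Since $\lambda_n > 0, \forall n$, both $a_n$ and $b_n$ are decreasing functions of
$\lambda_n$. Hence inequality (40) follows from Chebyshev's (sum)
inequality (page 240, equation 1.4, [17]). Equality holds if and
only if, for all indices $k$ for which $q_k \neq 0$ (denote such a set
by ixnz(q)), the eigenvalues are identical. That is, iff $\lambda_k = \lambda$,
$\forall k \in \mathrm{ixnz}(q)$. ■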
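Lemma 5 is easy to probe numerically with random symmetric positive definite matrices, including an equality case in which $B$ is a scalar multiple of $A$ (so that all eigenvalues of $A^{-1/2} B A^{-1/2}$ coincide). The dimension, sample count and seed below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_spd(n):
    """A random symmetric positive definite matrix (well conditioned by construction)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

def lemma5_gap(p, A, B):
    """Left-hand side minus right-hand side of (38); non-negative by Lemma 5."""
    lhs = 1.0 / (p @ np.linalg.solve(A + B, p))
    rhs = 1.0 / (p @ np.linalg.solve(A, p)) + 1.0 / (p @ np.linalg.solve(B, p))
    return lhs - rhs

gaps = [lemma5_gap(rng.standard_normal(5), random_spd(5), random_spd(5))
        for _ in range(200)]

# Equality case: B = c * A makes every eigenvalue of A^{-1/2} B A^{-1/2} equal to c.
A_eq = random_spd(5)
p_eq = rng.standard_normal(5)
gap_eq = lemma5_gap(p_eq, A_eq, 2.5 * A_eq)
```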

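Appendix D
Proof of Proposition 4

To show $D_{\mathrm{opt}}(P) = D_-(P)$, we can show that the condition
for equality in Lemma 5 is satisfied. However, we provide a
simpler proof. First we will show that $h$ is an eigenvector
of both $\Sigma$ and $\Gamma$, i.e., $\Sigma h = \lambda h$ and $\Gamma h = \mu h$, where the
eigenvalues $\lambda$ and $\mu$ will be derived later. Therefore
$(\Sigma + \Gamma / P_\xi) h = (\lambda + \mu / P_\xi) h$ and hence, from (17),

$J_{\mathrm{opt}}^{(K)} = h^T (\Sigma + \Gamma / P_\xi)^{-1} h = \frac{h_0^2 N}{\lambda + \mu / P_\xi}.$    (41)

We next find $\lambda$ and $\mu$. From the definitions of $\Sigma$ and $h$, we directly
have

$\lambda = \sigma^2 \left( (1 - \rho) + \rho N \right),$    (42)

so that $J_0 = h^T \Sigma^{-1} h = h_0^2 N / \lambda$. Let $\tilde{K} = K + 1$. From the
transformation $V \to \Omega$ in (16), we note that $\Omega$ consists of $N$
blocks of identical $\tilde{K} \times \tilde{K}$ sub-matrices. For $k = 1, 2, \ldots, N$,

$\Omega_{\mathcal{I}_k^w, \mathcal{I}_k^w} = \alpha I + \beta \mathbf{1}\mathbf{1}^T,$    (43)

where $\alpha = \sigma^2 (1 - \rho)$ and $\beta = \sigma^2 \rho + \eta^2 h_0^2$. Hence
$\Omega \mathbf{1}_L = (\alpha + \beta \tilde{K}) \mathbf{1}_L$ and therefore $\Omega^{-1} \mathbf{1}_L = \frac{1}{\alpha + \beta \tilde{K}} \mathbf{1}_L$. Similarly,
from the transformation $g \to G$ in (16), $G$ consists of columns
$G_{\mathcal{O}_k^w, k} = g_0 \mathbf{1}_{\tilde{K}}$. Hence $G \mathbf{1}_N = g_0 \mathbf{1}_L$ and $G^T \mathbf{1}_L = g_0 \tilde{K} \mathbf{1}_N$.
Next, $\mu$ is obtained by inverting the eigenvalue of $\Gamma^{-1}$ associated with $h$,

$\Gamma^{-1} h = h_0 \, G^T \Omega^{-1} G \mathbf{1}_N = \frac{g_0^2 \tilde{K}}{\alpha + \beta \tilde{K}} \, h.$    (44)

From (41), (42) and (44), we obtain the expression of $J_{\mathrm{opt}}^{(K)}$
in (22). To show $W_{\mathrm{opt}}^{(K)} \propto A$, it suffices to show that

$w_{\mathrm{opt}}^{(K)} \propto \Omega^{-1} G \Gamma (\Sigma + \Gamma / P_\xi)^{-1} h,$  (from (18))
$\propto \Omega^{-1} G \mathbf{1}_N,$  (since $h \propto \mathbf{1}$, $\Sigma \mathbf{1} = \lambda \mathbf{1}$, $\Gamma \mathbf{1} = \mu \mathbf{1}$)
$\propto \mathbf{1}_L,$  (since $G \mathbf{1}_N \propto \mathbf{1}_L$, $\Omega^{-1} \mathbf{1}_L \propto \mathbf{1}_L$)    (45)

which completes the proof.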